Frequent Pattern Mining under Multiple Support Thresholds
Authors
Abstract
Traditional methods use a single minimum support threshold to find the complete set of frequent patterns. However, in real-world applications, a single minimum support threshold is not adequate because it does not reflect the nature of each item. If the threshold is set too low, a huge number of patterns is generated, including uninteresting ones. If it is set too high, many interesting patterns (those involving rare items) may be lost. Recently, several methods have been studied to tackle this rare item problem by avoiding a single minimum support threshold. These methods consider the nature of each item by assigning different items different Minimum Item Support (MIS) thresholds. In this way, the complete set of frequent patterns is generated without creating uninteresting patterns or losing substantial ones. In this paper, we propose an efficient method, the Multiple Item Support Frequent Pattern growth algorithm (MISFP-growth), to mine the complete set of frequent patterns with multiple item support thresholds. In this method, a Multiple Item Support Frequent Pattern Tree (MISFP-Tree) is constructed to store all information crucial for mining frequent patterns. Because the MISFP-Tree is constructed with respect to the minimum of the MIS values, post-pruning and reconstruction phases are not required. To show the efficiency of the proposed method, it is compared with a recent tree-based algorithm, CFP-growth++, in various experiments on both real and synthetic datasets. Experimental results reveal that MISFP-growth outperforms CFP-growth++ in terms of execution time and memory space as the MIS values of items are varied.
Key-Words: Association rule mining, Frequent patterns, Rare itemsets, Multiple support thresholds
Acknowledgements: This work is partially supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under ARDEB 3501 Project No: 114E779
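To make the multiple-threshold setting in the abstract concrete, the sketch below enumerates itemsets by brute force and keeps those that are frequent under per-item thresholds. It is not MISFP-growth and builds no tree. It assumes the definition commonly used in the MIS literature, which the abstract does not spell out: an itemset is frequent when its support reaches the lowest MIS value among its items. The function name `mine_with_mis`, the toy transactions, and the MIS values are all hypothetical.

```python
from itertools import combinations


def mine_with_mis(transactions, mis):
    """Brute-force illustration of frequency under per-item thresholds.

    Assumption (not stated in the abstract): an itemset's threshold is
    the minimum MIS value among its items.
    """
    n = len(transactions)
    items = sorted(mis)
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = set(candidate)
            # Relative support: fraction of transactions containing the itemset.
            support = sum(1 for t in transactions if cset <= t) / n
            threshold = min(mis[i] for i in candidate)
            if support >= threshold:
                frequent[candidate] = support
    return frequent


# Hypothetical toy data: "caviar" is a rare item with a low MIS value.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "caviar"},
]
mis = {"bread": 0.5, "milk": 0.5, "caviar": 0.15}

result = mine_with_mis(transactions, mis)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With these values, the rare pattern containing `caviar` is kept because its threshold is 0.15, while `bread` and `milk` together (support 0.4) are rejected against their threshold of 0.5. A single global threshold of 0.5 would drop every pattern containing `caviar`.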
Similar resources
Multiple Minimum Support-Based Rare Graph Pattern Mining Considering Symmetry Feature-Based Growth Technique and the Differing Importance of Graph Elements
Frequent graph pattern mining is one of the most interesting areas in data mining, and many researchers have developed a variety of approaches by suggesting efficient, useful mining techniques by integration of fundamental graph mining with other advanced mining works. However, previous graph mining approaches have faced fatal problems that cannot consider important characteristics in the real ...
TFP-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds
Conventional frequent pattern mining algorithms require some user-specified minimum support, and then mine frequent patterns with support values that are higher than the minimum support. As it is difficult to predict how many frequent patterns will be mined with a specified minimum support, the Top-k mining concept has been proposed. The Top-k Mining concept is based on an algorithm for mining ...
Preference-Based Frequent Pattern Mining
Frequent pattern mining is an important data mining problem with broad applications. Although there are many in-depth studies on efficient frequent pattern mining algorithms and constraint pushing techniques, the effectiveness of frequent pattern mining remains a serious concern: it is non-trivial and often tricky to specify appropriate support thresholds and proper constraints. In this paper, ...
Efficiently Mining Frequent Closed Itemsets by Eliminating Data Redundancies
Recently, data mining has been applied in business information and intelligence systems for discovering interesting patterns and knowledge to support decision making processes. One of the most basic and important tasks of data mining is the mining of frequent itemsets, which are sets of items frequently purchased by customers. Many methods have been proposed for this problem. However, mining th...
Discovering Periodic-Frequent Patterns in Transactional Databases
Since mining frequent patterns from transactional databases involves an exponential mining space and generates a huge number of patterns, efficient discovery of user-interest-based frequent pattern set becomes the first priority for a mining algorithm. In many real-world scenarios it is often sufficient to mine a small interesting representative subset of frequent patterns. Temporal periodicity...